Smart Paper Notes

Deep metric learning improves lab of origin prediction of genetically engineered plasmids

Igor M. Soares , Fernando H. F. Camargo , Adriano Marques , Oliver M. Crook

Categories: Machine Learning | Artificial Intelligence | Neural and Evolutionary Computing

2021-11-24

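Genome engineering is undergoing unprecedented development and is now widely available. To ensure responsible biotechnology innovation and to reduce misuse of engineered DNA sequences, it is vital to be able to identify the lab of origin of engineered plasmids. Genetic engineering attribution (GEA), the ability to make sequence-lab associations, would support forensic experts in this process. Here, we propose a method based on metric learning that ranks the most likely labs of origin while simultaneously generating embeddings for plasmid sequences and labs. These embeddings can be used to perform various downstream tasks, such as clustering DNA sequences and labs, and can serve as features in machine learning models. Our approach employs circular shift augmentation and correctly ranks the lab of origin within its top 10 predictions 90% of the time, outperforming all current state-of-the-art approaches. We also demonstrate that we can perform few-shot learning and obtain 76% top-10 accuracy using only 10% of the sequences. This means that we outperform the previous CNN approach using only one tenth of the data. We also demonstrate that we are able to extract key signatures in plasmid sequences for particular labs, allowing an interpretable examination of the model's outputs.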
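
The abstract names circular shift augmentation without giving details. A minimal sketch follows, assuming it means rotating the start position of a circular plasmid sequence so that every rotation is treated as the same molecule. The function names `circular_shift` and `augment` are hypothetical and are not taken from the paper.

```python
import random


def circular_shift(seq: str, offset: int) -> str:
    """Rotate a circular sequence so that it starts at position `offset`."""
    if not seq:
        return seq
    offset %= len(seq)
    return seq[offset:] + seq[:offset]


def augment(seq: str, n_views: int, seed: int = 0) -> list:
    """Return `n_views` random rotations of the same circular sequence."""
    if not seq:
        return [seq] * n_views
    rng = random.Random(seed)
    return [circular_shift(seq, rng.randrange(len(seq))) for _ in range(n_views)]


if __name__ == "__main__":
    # Toy sequence, far shorter than a real plasmid.
    for view in augment("ACGTTGCA", n_views=3):
        print(view)
```

Each rotation keeps the base composition and the cyclic order of the sequence, so a model trained on such views is encouraged not to depend on where the linearized sequence happens to begin.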

translated by Google Translate

A Physics-Informed Neural Network to Model Port Channels

Marlon S. Mathias , Marcel R. de Barros , Jefferson F. Coelho , Lucas P. de Freitas , Felipe M. Moreno , Caio F. D. Netto , Fabio G. Cozman , Anna H. R. Costa , Eduardo A. Tannuri , Edson S. Gomi

Categories: Machine Learning

2022-12-20

We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - São Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
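
The two novelties described above can be illustrated with a small sketch. It rests on assumptions the abstract does not confirm: periodicity is imposed by feeding the network sine and cosine features of time, and evaluation points are drawn uniformly over a one-dimensional channel and one period. All names and numeric values are hypothetical, and the network and residual computation are omitted.

```python
import math
import random


def periodic_time_features(t, period):
    """Map time to (sin, cos) of its phase; any function of these repeats every `period`."""
    phase = 2.0 * math.pi * t / period
    return math.sin(phase), math.cos(phase)


def sample_collocation_points(n_points, channel_length, period, rng):
    """Draw fresh (x, t) evaluation points uniformly over the channel and one period."""
    return [
        (rng.uniform(0.0, channel_length), rng.uniform(0.0, period))
        for _ in range(n_points)
    ]


def training_batches(n_steps, n_points, channel_length, period, seed=0):
    """Yield a newly sampled batch at every step instead of reusing a fixed set."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield sample_collocation_points(n_points, channel_length, period, rng)


if __name__ == "__main__":
    # Illustrative values only; not the dimensions used in the paper.
    for step, batch in enumerate(training_batches(2, 3, 100.0, 12.0)):
        print(step, batch)
```

In a full PINN, each batch would be passed through the network and the residuals of the governing equations at those points would form the loss. Drawing new points costs only a few random numbers per step, which is consistent with the near-zero overhead the abstract reports.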

Berlin V2X: A Machine Learning Dataset from Multiple Vehicles and Radio Access Technologies

Rodrigo Hernangómez , Philipp Geuer , Alexandros Palaios , Daniel Schäufele , Cara Watermann , Khawla Taleb-Bouhemadi , Mohammad Parvini , Anton Krause , Sanket Partani , Christian Vielhaus

Categories: Machine Learning | Artificial Intelligence

2022-12-20

The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality-of-service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically in the area of vehicle communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign with the purpose of enabling a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of different studies towards V2X. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the on-boarding of new researchers. We provide an initial analysis of the data showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.

Managing Large Dataset Gaps in Urban Air Quality Prediction: DCU-Insight-AQ at MediaEval 2022

Dinh Viet Cuong , Phuc H. Le-Khac , Adam Stapleton , Elke Eichlemann , Mark Roantree , Alan F. Smeaton

Categories: Machine Learning | Artificial Intelligence

2022-12-19

Calculating an Air Quality Index (AQI) typically uses data streams from air quality sensors deployed at fixed locations and the calculation is a real time process. If one or a number of sensors are broken or offline, then the real time AQI value cannot be computed. Estimating AQI values for some point in the future is a predictive process and uses historical AQI values to train and build models. In this work we focus on gap filling in air quality data where the task is to predict the AQI at 1, 5 and 7 days into the future. The scenario is where one or a number of air, weather and traffic sensors are offline and explores prediction accuracy under such situations. The work is part of the MediaEval'2022 Urban Air: Urban Life and Air Pollution task submitted by the DCU-Insight-AQ team and uses multimodal and crossmodal data consisting of AQI, weather and CCTV traffic images for air pollution prediction.

Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning

Gustavo H. de Rosa , Mateus Roder , João Paulo Papa , Claudio F. G. dos Santos

Categories: Artificial Intelligence

2022-12-19

Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained via derivative-based optimizers, such as Stochastic Gradient Descent, which can become trapped in local optima and fail to reach good performance. A bio-inspired alternative to traditional optimization techniques, denoted as meta-heuristic, has received significant attention due to its simplicity and ability to avoid entrapment in local optima. In this work, we propose to use meta-heuristic techniques to fine-tune pre-trained weights, exploring additional regions of the search space, and improving their effectiveness. The experimental evaluation comprises two classification tasks (image and text) and is assessed on four datasets from the literature. Experimental results show nature-inspired algorithms' capacity in exploring the neighborhood of pre-trained weights, achieving results superior to those of their pre-trained counterparts. Additionally, a thorough analysis of distinct architectures, such as Multi-Layer Perceptron and Recurrent Neural Networks, attempts to visualize and provide more precise insights into the most critical weights to be fine-tuned in the learning process.
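
The idea of searching the neighborhood of pre-trained weights without gradients can be sketched as follows. This is a plain random-perturbation hill climb chosen for brevity. It is not one of the nature-inspired algorithms evaluated in the paper, and the function name, parameters, and toy loss are all hypothetical.

```python
import random


def fine_tune(weights, loss_fn, n_iters=200, n_candidates=10, step=0.05, seed=0):
    """Derivative-free search around `weights`; keeps a candidate only if it lowers the loss."""
    rng = random.Random(seed)
    best = list(weights)
    best_loss = loss_fn(best)
    for _ in range(n_iters):
        for _ in range(n_candidates):
            # Perturb every weight with small Gaussian noise around the current best.
            candidate = [w + rng.gauss(0.0, step) for w in best]
            candidate_loss = loss_fn(candidate)
            if candidate_loss < best_loss:
                best, best_loss = candidate, candidate_loss
    return best, best_loss


if __name__ == "__main__":
    # Toy quadratic loss standing in for a validation loss of a real model.
    target = [0.3, -1.2, 0.8]

    def toy_loss(w):
        return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

    pretrained = [0.25, -1.0, 1.0]
    tuned, tuned_loss = fine_tune(pretrained, toy_loss)
    print(toy_loss(pretrained), tuned_loss)
```

Because a candidate is accepted only when it improves the loss, the returned weights are never worse than the pre-trained starting point on the objective being searched.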

Transformer-based normative modelling for anomaly detection of early schizophrenia

Pedro F Da Costa , Jessica Dafflon , Sergio Leonardo Mendes , João Ricardo Sato , M. Jorge Cardoso , Robert Leech , Emily JH Jones , Walter H. L. Pinaya

Categories: Machine Learning | Artificial Intelligence

2022-12-08

Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches have surged as an alternative method. By using a generative model to learn the distribution of healthy brain data patterns, we can identify the presence of pathologies as deviations or outliers from the distribution learned by the model. In particular, deep generative models showed great results as normative models to identify neurological lesions in the brain. However, unlike most neurological lesions, psychiatric disorders present subtle changes widespread in several brain regions, making these alterations challenging to identify. In this work, we evaluate the performance of transformer-based normative models to detect subtle brain changes expressed in adolescents and young adults. We trained our model on 3D MRI scans of neurotypical individuals (N=1,765). Then, we obtained the likelihood of neurotypical controls and psychiatric patients with early-stage schizophrenia from an independent dataset (N=93) from the Human Connectome Project. Using the predicted likelihood of the scans as a proxy for a normative score, we obtained an AUROC of 0.82 when assessing the difference between controls and individuals with early-stage schizophrenia. Our approach surpassed recent normative methods based on brain age and Gaussian Process, showing the promising use of deep generative models to help in individualised analyses.
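
The evaluation step described above, in which a scan's likelihood under the normative model serves as a score and AUROC measures group separation, can be sketched in a few lines. The log-likelihood values below are made up for illustration and are not results from the paper. The helper name `auroc` is hypothetical.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Made-up log-likelihoods assigned by a normative model to each scan.
    controls_loglik = [-10.2, -9.8, -11.0, -10.5]
    patients_loglik = [-12.1, -10.9, -13.0]

    # Lower likelihood means a larger deviation from the norm,
    # so the negative log-likelihood is used as the anomaly score.
    controls_score = [-v for v in controls_loglik]
    patients_score = [-v for v in patients_loglik]
    print(auroc(patients_score, controls_score))
```

This pairwise form is equivalent to the area under the ROC curve and needs no threshold, which suits a setting where the model outputs a continuous deviation score rather than a diagnosis.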

Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data

Sarah Samorodnitsky , Chris H. Wendt , Eric F. Lock

Categories: Machine Learning (Statistics)

2022-11-29

Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by available methods to examine the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, simultaneously predict important disease phenotypes or clinical outcomes, nor accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, as well as the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP .
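
The link between normal priors on the factors and a nuclear norm penalty can be seen in a simplified single-matrix setting. The sketch below is an illustration under stated assumptions, not the paper's structured multi-source objective: one data matrix $X$, a signal $S = UV^\top$, independent $N(0, 1/\lambda)$ priors on the entries of $U$ and $V$, noise variance $\sigma^2$, and factors with at least $\operatorname{rank}(S)$ columns.

```latex
% Simplified single-matrix sketch; the symbols \lambda and \sigma^2 are assumptions.
\begin{align}
-\log p(U, V \mid X)
  &= \frac{1}{2\sigma^2}\,\lVert X - UV^\top \rVert_F^2
   + \frac{\lambda}{2}\left(\lVert U \rVert_F^2 + \lVert V \rVert_F^2\right)
   + \text{const}, \\
\lVert S \rVert_*
  &= \min_{U, V \,:\, UV^\top = S}
     \frac{1}{2}\left(\lVert U \rVert_F^2 + \lVert V \rVert_F^2\right), \\
\min_{U, V}\; -\log p(U, V \mid X)
  &= \min_{S}\; \frac{1}{2\sigma^2}\,\lVert X - S \rVert_F^2
   + \lambda\,\lVert S \rVert_* + \text{const}.
\end{align}
```

In this simplified case the minimizer over $S$ soft-thresholds the singular values of $X$ at $\lambda\sigma^2$, setting the small ones exactly to zero, which is how a nuclear norm penalty performs rank selection.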

MONAI: An open-source framework for deep learning in healthcare

M. Jorge Cardoso , Wenqi Li , Richard Brown , Nic Ma , Eric Kerfoot , Yiheng Wang , Benjamin Murrey , Andriy Myronenko , Can Zhao , Dong Yang

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-11-04

Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.

Jubileo: An Open-Source Robot and Framework for Research in Human-Robot Social Interaction

Jair A. Bottega , Victor A. Kich , Alisson H. Kolling , Jardel D. S. Dyonisio , Pedro L. Corçaque , Rodrigo da S. Guerra , Daniel F. T. Gamarra

Categories: Robotics

2022-09-27

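Human-robot interaction (HRI) is essential to the widespread use of robots in daily life. Robots will eventually be able to carry out a variety of duties in human civilization through effective social interaction. As robots start to proliferate in personal workspaces, it is essential to create straightforward and understandable interfaces for interacting with them. Typically, interactions with simulated robots are displayed on screens. Virtual reality (VR) is a more appealing alternative, since it gives visual cues closer to those seen in the real world. In this study, we introduce Jubileo, a robotic animatronic face with various tools for research and application development in the field of human-robot social interaction. The Jubileo project offers more than a fully functional open-source physical robot. It also provides a comprehensive framework that can be operated through a VR interface, bringing an immersive environment for testing HRI applications and noticeably better deployment speed.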

Learn what matters: cross-domain imitation learning with task-relevant embeddings

Tim Franzmeyer , Philip H. S. Torr , João F. Henriques

Categories: Artificial Intelligence

2022-09-24

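We study how an autonomous agent learns to perform a task from demonstrations in a different domain, such as a different environment or a different agent. Such cross-domain imitation learning is required, for example, to train an artificial agent from demonstrations by a human expert. We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrations or further domain knowledge. We jointly train the learner agent's policy and learn a mapping between the learner and expert domains with adversarial training. We achieve this by using a mutual information criterion to find an embedding of the expert's state space that contains task-relevant information and is invariant to domain specifics. This step significantly simplifies estimating the mapping between the learner and expert domains and hence facilitates end-to-end learning. We demonstrate successful transfer of policies between considerably different domains, without extra demonstrations, and in situations where other methods fail.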
